NIP - An Imperfection Processor to Data Mining datasets

Authors

  • José Manuel Cadenas
  • M. Carmen Garrido
  • Raquel Martínez
Abstract

A growing number of techniques can work with low quality data. As a result, issues related to data quality have become more crucial and consume the majority of the time and budget of data mining projects. One problem for researchers is the lack of low quality data with which to test their techniques. Also, as far as we know, there is no software tool focused on creating and managing low quality datasets that treats low quality data as broadly as possible and helps build repositories of low quality datasets for testing and comparing data mining techniques and algorithms. For this reason, we present in this paper a software tool that can create and manage low quality datasets. Among other things, the tool can transform a dataset by adding low quality data, removing and replacing any data, and constructing a fuzzy partition of the attributes. It also supports different input/output formats for the dataset.
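One of the transformations mentioned in the abstract, adding low quality data to a dataset, can be illustrated with a minimal sketch. The code below is not taken from NIP: the function name `inject_missing`, its parameters, and the choice of missing values as the only kind of imperfection are assumptions made purely for illustration.

```python
import random


def inject_missing(rows, fraction, seed=0, missing=None):
    """Return a copy of `rows` with a share of its cells replaced by `missing`.

    Hypothetical helper for illustration only; it is not part of NIP.
    `rows` is a list of equally long lists, `fraction` is the share of
    cells to blank out, and `seed` makes the selection reproducible.
    """
    rng = random.Random(seed)
    # Enumerate every (row, column) position in the dataset.
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    # Pick the requested number of positions without repetition.
    chosen = set(rng.sample(cells, round(fraction * len(cells))))
    # Build a new dataset, leaving the input untouched.
    return [
        [missing if (i, j) in chosen else value for j, value in enumerate(row)]
        for i, row in enumerate(rows)
    ]


data = [[5.1, 3.5, "a"], [4.9, 3.0, "b"], [6.2, 2.9, "c"], [5.9, 3.0, "c"]]
noisy = inject_missing(data, fraction=0.25)
```

With 12 cells and `fraction=0.25`, three cells are replaced. Fixing the seed keeps the degraded dataset reproducible, which matters when the same low quality dataset is reused to compare several techniques.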

Similar resources

Guest Editorial: Special Issue on Software for Soft Computing

The term Soft Computing refers to a family of techniques (Fuzzy Logic, Neuro-computing, Probabilistic Reasoning, Evolutionary Computation, etc.) endowed with the ability to work in a cooperative way, taking profit from the main advantages of each other, thus tackling with complex real-world problems really hard to solve otherwise [6]. In the last years, many software tools have been developed f...


Velocity Inversion with an Iterative Normal Incidence Point (NIP) Wave Tomography with Model-Based Common Diffraction Surface (CDS) Stack

Normal Incidence Point (NIP) wave tomography inversion has been recently developed to generate a velocity model using Common Reflection Surface (CRS) attributes, which is called the kinematic wavefield attribute. In this paper, we propose to use the model based Common Diffraction Surface (CDS) stack method attributes instead of data driven Common Reflection Surface attributes as an input data p...


MINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS

This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high support or occurrence within particular time intervals, they do not necessarily get similar support across the whole dataset, which makes i...


An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

Swarm Intelligence (SI) is an innovative artificial intelligence technique for solving complex optimization problems. Data clustering is the process of grouping data into a number of clusters. The goal of data clustering is to make the data in the same cluster share a high degree of similarity while being very dissimilar to data from other clusters. Clustering algorithms have been applied to a ...


Journal title:
  • Int. J. Computational Intelligence Systems

Volume 6, Issue

Pages -

Publication date: 2013